{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A Simple PyTorch Tutorial\n",
    "======================"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "PyTorch is a neural network library for Python. Before we dive into PyTorch, let us first consider what functionality a neural network library should support:\n",
    "\n",
    "* Implement a neural network $\\mathbf{y} = f_\\mathbf{\\theta}(\\mathbf{x})$ parameterized by $\\mathbf{\\theta}$\n",
    "* Support inference: given an input example $\\mathbf{x}$, compute the output of the network $\\mathbf{y}$\n",
    "* Support training of the network given training data $\\langle \\mathbf{x}, \\mathbf{y} \\rangle$\n",
    "    - The library should support auto-differentiation: computing the gradient $\\frac{\\partial f_\\mathbf{\\theta}(\\mathbf{x})}{\\partial \\mathbf{\\theta}}$\n",
    "    - The library should also provide implementations of common optimizers (e.g., SGD, Adam) to update the network's parameters $\\mathbf{\\theta}$ via gradient descent\n",
    "    \n",
    "PyTorch provides recipes for all of these (and more)!"
   ]
  },
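  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough preview of how these pieces fit together, the cell below sketches inference, auto-differentiation, and an optimizer update in a few lines. It is only an illustration under stated assumptions: the built-in ``nn.Linear`` layer stands in for the model, and the squared-error loss, the learning rate of 0.1, and the toy data are made up for this sketch. The tutorial's own model is defined further down."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn\n",
    "\n",
    "torch.manual_seed(0)  # make the random initialization reproducible\n",
    "\n",
    "# Stand-in model (assumption): a built-in linear layer computing w^T x + b\n",
    "model = nn.Linear(2, 1)\n",
    "optimizer = torch.optim.SGD(model.parameters(), lr=0.1)\n",
    "\n",
    "# Toy training pair, made up for illustration\n",
    "x = torch.tensor([0.5, 0.3])\n",
    "target = torch.tensor([1.0])\n",
    "\n",
    "losses = []\n",
    "for step in range(20):\n",
    "    optimizer.zero_grad()             # clear gradients from the previous step\n",
    "    y = model(x)                      # inference: the forward pass\n",
    "    loss = ((y - target) ** 2).sum()  # squared error on the single example\n",
    "    loss.backward()                   # auto-differentiation\n",
    "    optimizer.step()                  # gradient descent update\n",
    "    losses.append(loss.item())\n",
    "\n",
    "print(f'first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}')"
   ]
  },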
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us try to understand some basic concepts of PyTorch using a toy example. Assume the network we'd like to implement is a simple linear model:\n",
    "$$ y = \\mathbf{w}^\\intercal \\mathbf{x} + b  $$\n",
    "where our model's parameter $\\mathbf{\\theta} = \\langle \\mathbf{w}, b \\rangle$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import torch.nn as nn  # where most neural network modules are"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can implement our simple neural network by subclassing ``nn.Module``, the base class for neural network modules in PyTorch. Concretely, we will\n",
    "\n",
    "* Create two model parameters, $\\mathbf{w}$ and $b$.\n",
    "* Define the computation routine of the network: how $y$ is computed given $\\mathbf{x}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MyNeuralNet(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(MyNeuralNet, self).__init__()\n",
    "        \n",
    "        # w is a 2-dimensional vector with default value [1.0, 2.0]\n",
    "        self.w = nn.Parameter(\n",
    "            torch.tensor([1.0, 2.0])\n",
    "        )\n",
    "        # b is a scalar with default value 0.5\n",
    "        self.b = nn.Parameter(\n",
    "            torch.tensor(0.5)\n",
    "        )\n",
    "        \n",
    "    def forward(self, x):\n",
    "        \"\"\"The forward pass\"\"\"\n",
    "        y = self.w.dot(x) + self.b\n",
    "        \n",
    "        return y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Having defined our simple \"neural net\", let's now create an instance and run it!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The output `y` is 1.600000023841858\n"
     ]
    }
   ],
   "source": [
    "f = MyNeuralNet()\n",
    "x = torch.tensor([0.5, 0.3])\n",
    "y = f(x)\n",
    "\n",
    "print(f'The output `y` is {y}')"
   ]
  },
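  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick hand check, the same number follows from the formula $y = \\mathbf{w}^\\intercal \\mathbf{x} + b$: $1.0 \\cdot 0.5 + 2.0 \\cdot 0.3 + 0.5 = 1.6$. The trailing digits in the printed value appear because PyTorch tensors use 32-bit floats by default, and 1.6 is not exactly representable in that format. The cell below is a small sketch of this check using plain tensors rather than the module itself; the name ``y_manual`` is introduced only for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "\n",
    "# Same values as the module's default parameters and the input above\n",
    "w = torch.tensor([1.0, 2.0])\n",
    "b = torch.tensor(0.5)\n",
    "x = torch.tensor([0.5, 0.3])\n",
    "\n",
    "y_manual = torch.dot(w, x) + b  # w^T x + b\n",
    "\n",
    "print(y_manual.dtype)\n",
    "print(torch.isclose(y_manual, torch.tensor(1.6)))"
   ]
  },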
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We then call ``.backward()`` on the output of the neural network, ``y``, to perform backpropagation, which computes the gradients of ``y`` with respect to the model parameters $\\mathbf{\\theta} = \\langle \\mathbf{w}, b \\rangle$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "y.backward()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get the gradients of the model parameters $\\mathbf{\\theta} = \\langle \\mathbf{w}, b \\rangle$, simply check the ``.grad`` attribute of each parameter:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "gradient of $w$: tensor([0.5000, 0.3000])\n",
      "gradient of $b$: 1.0\n"
     ]
    }
   ],
   "source": [
    "print(f'gradient of $w$: {f.w.grad}')\n",
    "print(f'gradient of $b$: {f.b.grad}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can do a sanity check to confirm that the results are correct:\n",
    "$$\\frac{\\partial y}{\\partial \\mathbf{w}} = \\mathbf{x} \\quad \\frac{\\partial y}{\\partial b} = 1.0$$"
   ]
  }
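  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same check can also be done in code. The cell below is a self-contained sketch that uses plain tensors with ``requires_grad=True`` in place of the module's ``nn.Parameter``s (a substitution made for illustration) and compares the gradients computed by autograd with the analytic values $\\mathbf{x}$ and $1.0$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "\n",
    "# Leaf tensors that track gradients, mirroring the model's default parameters\n",
    "w = torch.tensor([1.0, 2.0], requires_grad=True)\n",
    "b = torch.tensor(0.5, requires_grad=True)\n",
    "x = torch.tensor([0.5, 0.3])\n",
    "\n",
    "y = w.dot(x) + b\n",
    "y.backward()  # populates w.grad and b.grad\n",
    "\n",
    "# Analytic gradients: dy/dw = x and dy/db = 1\n",
    "grad_w_ok = torch.allclose(w.grad, x)\n",
    "grad_b_ok = torch.allclose(b.grad, torch.tensor(1.0))\n",
    "print(grad_w_ok, grad_b_ok)"
   ]
  }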
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
